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ABSTRACT 


Second-order  sensitivity  analysis  methods  are  developed  for  analyzing  the 
behavior  of  a local  solution  to  a constrained  nonlinear  optimization  problem 
when  the  problem  functions  are  perturbed  slightly.  Specifically,  formulas 
involving  third-order  tensors  are  given  to  compute  second  derivatives  of 
components  of  the  local  solution  with  respect  to  the  problem  parameters.  When 
in  addition,  the  problem  functions  are  factorable,  it  is  shown  that  the 
resulting  tensors  are  polyadic  in  nature. 

Keywords:  Second-order  sensitivity  analysis,  high-order  methods,  Nth 

derivatives,  polyads,  tensors. 
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1.  INTRODUCTION 


In  an  earlier  paper  (Jackson  and  McCormick  (1984))  the  structure  taken  by 
N-dimensional  arrays  of  Nth  partial  derivatives  of  the  special  class  of 
factorable  functions  was  examined.  The  N-dimensional  arrays  (or  tensors  as 
they  are  sometimes  called)  turn  out  to  be  computable  naturally  as  the  sum  of 
generalized  outer  product  matrices  (polyads).  For  the  benefit  of  the  reader 
unfamiliar  with  polyads  and  factorable  functions,  some  of  the  material  in  that 
paper  is  repeated  here. 

This  natural  polyadic  structure  has  important  computational  implications 
for  solving  problems  associated  with  nonlinear  programming.  It  means  for 
example  that  with  some  minor  modification  to  existing  software  routines,  high- 
order  derivatives  can  be  calculated  efficiently,  making  previously  intractable 
techniques  that  require  them,  again  worthy  of  consideration.  In  the  disser- 
tation by  Jackson  (1983)  from  which  most  of  the  material  in  this  paper  is 
taken,  these  implications  were  pursued  for  second-order  sensitivity  analysis 
and  high-order  methods  for  solving  the  problem: 

minimize  f(x), 
x e Rn 


(1.1) 


subject  to  g(x)  > 0, 

for  i-1,  ...,  m,  when  f(x)  and  g(x)  are  factorable  functions. 

The  ability  to  compute  third  derivatives  efficiently  provides  ready 
access  to  second-order  nonlinear  programming  sensitivity  information.  In 
Section  3 of  this  paper,  second-order  sensitivity  analysis  methods  are  devel- 
oped for  analyzing  the  behavior  of  a local  solution  to  (1.1)  when  the  problem 
functions  are  perturbed  slightly.  Section  3 begins  by  summarizing  results 
from  first-order  sensitivity  analysis  which  provide  formulas  for  the  first 


derivatives  of  the  components  of  the  local  solution  with  respect  to  the 
problem  parameters.  Also  developed  are  formulas,  iravolving  third-order 
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tensors,  for  computing  the  second  derivatives  of  the  local  solution  with 
respect  to  the  problem  parameters.  In  addition,  the  polyadic  structure  of  the 
tensors  is  investigated  and  displayed,  and  techniques  for  manipulating  these 
three-dimensional  arrays,  capitalizing  on  this  special  structure,  are 
developed.  In  general,  this  type  of  array  manipulation  is  straightforward  but 
time-consuming  and  requires  significant  computer  storage.  It  is  shown  that 
these  difficulties  are  ameliorated  when  the  special  structure  of  factorable 
functions  is  exploited.  Examples  of  the  use  of  these  formulas  for  estimating 
the  solution  to  perturbed  problems  using  Taylor  series  approximations  are 
also  given. 

Loosely,  a factorable  function  is  a multivariable  function  that  can  be 
written  as  the  last  of  a finite  sequence  of  functions,  in  which  the  first  n 
functions  in  the  sequence  are  just  the  coordinate  variables,  and  each  function 
beyond  the  n^  is  a sum,  a product,  or  a single -variable  transformation  of 
previous  functions  in  the  sequence.  More  rigorously,  let  [fj(x),  f2(x),  ..., 
fx,(x)]  t>e  a finite  sequence  of  functions  such  that  f£:Rn  + R,  where  each 
fi(x)  is  defined  according  to  one  of  the  following  rules. 

Rule  1 . For  1*1,  ...,  n,  f^(x)  is  the  value  of  the  i ^ Euclidean  coordinate: 

f^x)  * X£. 

Rule  2 . For  i*n+l,  ...»  L,  fi(x)  is  formed  using  one  of  the  following  compos- 
itions: 

a. )  f i (x ) - fj(i)(x)  + fk(i)(x);  or 

b. )  f i(x ) - fj(i)(x)  • fk(i)(x);  or  (1.2) 

c. )  f i(x)  * T i [f  j(i)(x)] ; 

where  j(i)  < i,  k(i)  < i,  and  is  a function  of  a single  variable.  Then 
f(x)  » fL,(x)  is  a factorable  function  and  [fi(x),  f2(x),  ...,  fk(x)]  is  a 
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factored  sequence.  Thus  a function,  f(x),  will  be  called  factorable  if  it  can 
be  formed  according  to  Rules  1 and  2,  and  the  resulting  sequence  of  functions 
will  be  called  a factored  sequence  or,  at  times,  the  function  written  in 
factored  form. 

Although  it  is  not  always  immediately  grasped,  the  concept  of  a factor- 
able function  is  actually  a very  natural  one.  In  fact,  it  is  just  a 

formalization  of  the  natural  procedure  one  follows  in  evaluating  a complicated 
function.  Consider  for  example  the  function 

f(x)  = a^x [sinb^x]  [expcTx]  , (1.3) 

where  a,b,c  and  x are  (2  x 1)  vectors.  The  natural  approach  to  evaluating 
this  function  for  specific  values  of  x^  and  X2  is  first  to  compute  the  quanti- 
ties within  the  parentheses,  then  to  apply  the  sine  and  exponential  functions, 

and  finally  to  multiply  the  three  resulting  quantities.  This  might  be  done  in 
stages  as  follows. 


fl  * X1 

f8  ■ f6  + f 7 

f 2 * x2 

f9  - cifi 

f3  - aifi 

f10 

- C2f2 

f4  - a2f 2 

*11 

* f9  + flO 

f5  - f3  + f4 

f12 

* sin(fg) 

f6  * bifi 

f 13 

* exp(fn) 

f7  - b2f 2 

f 14 

" f5  * f12 

f 15  » f 13  * fi4 

9 

This  is  one  possible  factored  sequence  for  the  function  in  (1.3). 
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For  completeness,  we  also  note  here  that  a Factorable  Programming 
problem  may  be  compactly  (though  less  transparently)  written  as 

minimize  f^x) 
x e Rn 

subject  to  < fi(x)  < U£, 

for  i = 1 , L-l,  where  it  is  possible  that  l±  = —»  and/or  ui  * +00,  and 

where  fj.(x)  » xi,  for  i < n,  and  ff(x)  is  defined  recursively  for  i > n as 

i-1  i-1  p 

ft(x)  - ^ Up[fp(x)l  + ^ ^ V^p[fp(x)J  Wp_q[f,(x)l, 

A 

where  U,  V,  and  W are  functions  of  a single  variable. 

In  order  to  appreciate  fully  the  value  of  factorable  functions  the  con- 
cept of  an  outer  product  matrix  mist  be  introduced.  An  (m  x n)  matrix  A is 
called  an  outer  product  matrix  if  there  exists  a scalar  a,  an  (m  x 1)  vector 
a,  and  an  (n  x 1)  vector  b such  that 

A ■ aabX . 

The  expression  aab£  is  called  an  outer  product  or  a dyad.  Note  that  a dyad 
is  conformable  since  the  dimensions  of  the  product  are  (m  x 1 ) ( 1 x 1 ) ( 1 x n), 
which  yields  the  (m  x n)  outer  product  matrix  A as  desired.  A useful  property 
of  outer  product  matrices  is  that,  if  kept  as  dyads,  matrix  multiplication 
with  them  is  simplified  to  inner  products  alone,  saving  the  computations 
required  to  form  the  matrices  involved.  For  example, 

Ac  » aa[b^c]  , 
d^A  =*  [d^-ajab^,  and 
AF  »aa[bTF], 

where  c is  (n  x 1) , d is  (m  x 1)  and  F is  (n  x m). 
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It  is  well-known  (see  for  instance  McCormick  (1983))  that  factorable 
functions  possess  two  very  special  properties  that  can  be  exploited  to  produce 
efficient  (fast  and  accurate)  algorithms:  i)  once  written  in  factorable  form, 

their  gradients  and  Hessians  may  be  computed  exactly,  automatically,  and 
efficiently;  and  ii)  their  Hessians  occur  naturally  as  sums  of  dyads  whose 
vector  factors  are  gradients  of  terms  in  the  factored  sequence.  The  first  of 
these  properties  has  eased  the  task  of  providing  the  derivatives  of  a 
nonlinear  programming  problem  to  a computer  code  that  solves  it,  and  has  the 
potential  eventually  to  trivialize  it.  The  second  has  obviated  the  task  of 
multiplying  a matrix  by  a vector,  reducing  it  to  a series  of  inner  products, 
as  noted  above,  which  in  many  cases  results  In  less  effort. 

Since  the  discovery  of  factorable  functions,  the  theory  of  Factorable 
Programming  has  been  further  developed  and  refined  in  a variety  of  ways. 

Ghaemi  and  McCormick  (1970)  developed  a computer  code  (FACSUMT),  which 
processes  the  functions  in  a factorable  program  and  provides  the  interface  to 
the  SUMT  nonlinear  programming  code  developed  originally  by  My  lander  et_  al . 
(1971).  An  early  version  of  this  code  is  described  in  Pugh  (1972). 

Recently  the  routines  from  FACSUMT  that  process  the  problem  functions  in 
factorable  form  have  been  separated  out  into  a "stand-alone " package  (FACT IN) 
that  can  be  used  with  any  nonlinear  programming  system  to  provide  automati- 
cally the  values  of  the  functions,  gradients  and  Hessians  at  a point.  The 
basic  requirement  of  FACTIN  is  that  the  user  write  the  functions  of  the  prob- 
lem in  factorable  form. 

Once  the  problem  Is  written  in  factorable  form,  the  routines  in  FACTIN 
automatically  calculate  the  exact  first  and  second  derivatives  of  the 
functions  for  use  in  the  optimization  algorithm. 
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It  is  important  to  understand  that  the  derivative  calculations  performed 
by  the  FACTIN  code  are  not  estimations,  but  mathematically  exact  calculations. 
Furthermore,  they  are  also  compact,  since  factored  sequences  mimic  hand  calcu- 
lations, and  thus  this  technique  is  different  from  symbolic  manipulation  tech- 
niques for  differentiation,  which  tend  to  produce  large  amounts  of  code.  The 
techniques  used  in  Factorable  Programming  are  efficient  exploitations  of  the 
special  structure  inherent  in  factorable  functions  and  their  partial  deriva- 
tive arrays.  Moreover,  while  it  is  true  that  some  symbolic  dif f erentiaters 
also  can  recognize  functions  which  can  be  described  similarly  as  a sequence  of 
rules,  each  of  which  can  be  differentiated,  the  similarity  ends  there.  Such 
symbolic  dif f erentiaters  continue  to  differentiate  the  rules,  without  exploit- 
ing the  polyadic  structure  of  the  result.  See,e.g.,  Kedem  (1980),  Rail 
(1980),  Wengert  (1964),  Reiter  and  Gray  (1967),  Hillstrom  (1982),  and  Warner 
(1975).  It  is  this  latter  effort  which  provides  the  real  value  of  factorable 
functions,  and  which  therefore  separates  the  two  techniques.  More  on  Factor- 
able Programming  codes  and  the  Factorable  Programming  system  (FACTPROG)  under 
development  at  the  National  Bureau  of  Standards  is  given  in  Jackson  and 
McCormick  (1984). 

Further  extensions  of  Factorable  Programming  theory  were  provided  by 
Shayan  (1978),  who  developed  an  automatic  method  for  computing  the  m^-order 
directional  derivative  of  a factorable  function  and  noted  that  the  efficiency 
of  a solution  technique  can  be  evaluated  when  the  functions  are  factorable  by 
counting  basic  operations  and  basic  functions.  This  is  a more  accurate  mea- 
sure of  efficiency  than  the  technique  of  counting  the  number  of  "equivalent 
function  evaluations"  discussed  by  Miele  and  Gonzalez  (1978). 
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As  was  mentioned  earlier,  Hessians  of  factorable  functions  possess  a 
natural  dyadic  structure  which  can  be  exploited.  This  structure  was  used  in 
Emami  (1978)  to  develop  a matrix  factorization  scheme  for  obtaining  a general 
ized  inverse  of  the  Hessian  of  a factorable  function.  Ghotb  (1980)  also  capi 
talized  on  this  structure  and  provided  formulae  for  computing  a generalized 
inverse  of  a reduced  Hessian  when  it  is  given  in  dyadic  form.  Sofer  (1983) 
has  extended  this  last  concept  further,  by  utilizing  the  dyadic  structure  to 
obtain  computationally  efficient  techniques  for  constructing  a generalized 
inverse  of  a reduced  Hessian  and  updating  it  from  iteration  to  iteration. 

Another  direction  was  pursued  by  DeSilva  and  McCormick  (1978),  who 
developed  the  formulae  and  methodology  to  utilize  the  input  to  general 
nonlinear  programs  in  factorable  form  to  perform  first-order  sensitivity 
analysis  on  the  solution  vector. 

Some  last  comments  on  notation  are  required.  There  are  unavoidable 
complications  in  the  theory  that  follows  that  require  subscripted  subscripts. 
In  some  cases  these  are  used.  In  other  cases,  subscripted  subscripts  are 
replaced  with  subscript  functions.  For  example,  ij  i(j).  The  choice  in 
each  case  was  made  on  the  basis  of  clarity  of  resulting  formulae.  Also  In 
what  follows,  all  vectors  are  assumed  to  be  column  vectors,  and,  where  not 
otherwise  stated,  differentiation  is  with  respect  to  the  vector  x ■ (xj , X2, 
•••»  xn)T.  Lastly,  we  use  3 and  7 to  indicate  partial  differentiation,  and  d 
and  D indicate  total  differentiation. 
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2.  THE  SPECIAL  STRUCTURE  OF  TENSORS  OF  FACTORABLE  FUNCTIONS 


2.1  Background  and  Notation 

One  fundamental  value  of  factorable  functions  lies  in  the  simple  and 
computationally  efficient  forms  that  result  for  their  Hessians.  In  tact 
factorable  programming  is  based  on  the  existence  of,  and  the  simplified 
operations  that  result  from,  these  simple  forms.  The  seminal  result,  (Fiacco 
and  McCormick  (1968),  pp  184-188),  is  that  the  Hessian  of  a factorable 
function  can  be  written  as  the  sum  of  dyads,  or  outer  products,  of  gradients 
of  functions  in  the  factored  sequence.  As  was  shown  in  Jackson  and  McCormick 
(1984)  and  will  be  summarized  here,  this  basic  result  may  be  generalized,  but 
first  it  is  necesssary  to  generalize  the  concepts  of  Hessian  and  dyad. 

Let  A e R^  1 x X nN^  ? and  let  A^,...,!^  denote  the  (i^ , 1n)c^ 

element  of  this  array.  For  the  purposes  of  this  paper,  A is  called  the 

N^-order  tensor  of  a multivariable  function  f(x)  if 

N 

Ai !»•••»%  * 3 f(x)/3xiN...3xi1. 

Note  that  gradients  and  Hessians  are  tensors  of  order  1 and  2 respectively. 

An  N-dimensional  array  A is  called  a generalized  outer  product  matrix  if 
there  exists  a scalar  a,  and  an  ordered  set  of  vectors  a^ , ...,  sn  (where  each 
a^  is  (nk  x 1))  such  that  each  element  of  A is  generated  by  the  product  of  the 
scalar  a and  certain  specific  elements  of  the  vectors  aj , ...,  aN  as  follows 

Ailf...,iN  " a * al.*i  * *•*  * aN,%> 

for  ij.  ■ 1,  ...»  ni ; •••;  In  * 1*  •••»  njj,  where  ak,ik  represents  the  (iit)ch 
element  of  the  (n^  x 1)  vector  a^. 
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The  scalar  and  set  of  vectors  which  generate  a generalized  outer  product 


matrix  taken  together  are  called  a poly ad  and  are  written 


(a.-ai  •••  aN). 


(2.1) 


where  order  is  important,  i.e.  the  vector  in  position  j is  associated  with  the 
jth  dimension.  A polyad  containing  N vector  factors  is  called  an  N-ad.  Also, 
an  expression  containing  a sum  of  polyads  is  called  a polyadic,  and  an 
expression  containing  a sum  of  N-ads  is  called  an  N-adic.  (The  actual  addi- 
tion here  is  performed  as  a direct  sum  of  the  associated  generalized  outer 
product  matrices.)  When  vector  factors  in  a polyad  are  repeated,  exponential 
notation  is  used,  as,  e.g.,  in  the  case  of  the  symmetric  N-ad,  (a:[a]N).  Note 
that  the  representation  of  a generalized  outer  product  matrix  by  a polyad  is 
not  unique.  For  example,  (a/y:[aiY]  ***  afl)  generates  the  same  N-dimensional 
array  of  numbers  as  does  (2.1)  for  any  nonzero  scalar  Y . Finally,  note  that  a 
2-ad  of  the  form  (a:ab)  is  equivalent  to  the  more  familiar  dyad  of  the  form 
aab? , and  the  two  will  be  used  interchangeably. 

Because  the  concepts  of  generalized  outer  product  matrix  and  polyad  are 
so  fundamental  to  what  follows,  an  example  is  included  to  help  in 
understanding  them.  Consider  first  the  dyad  (10:a^a2)  where  ai  » (4, 3, 2,1)?, 
and  “ (3,2,1)?.  The  generalized  outer  product  matrix  of  order  2 generated 
by  this  dyad  is  the  (4  x 3)  array 


If  the  vector  a3  = (2,1)T  is  added  to  form  the  triad  (10:aia2a3),  the  result 
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is  a generalized  outer  product  matrix  of  order  3 with  dimensions  (4  x 3 x 2). 
To  form  this  three-dimensional  array,  the  outer  product  between  the  matrix 
above  and  the  vector  is  formed.  Thus  the  matrix  above  is  multiplied  by  2 
to  obtain  the  front  matrix  in  the  three-dimensional  array  and  by  1 to  obtain 
the  back  matrix.  These  are: 
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Front  Back 

2.2  The  First  and  Second  Order  Cases 
In  this  section,  the  special  polyadic  structure  of  the  gradient  and  the 
Hessian  of  a factorable  function  is  exhibited. 

Theorem  1 . (Monadic  Gradients) 

Let  f(x)  be  a factorable  function  in  Rn,  let  [f^(x),  f2(x),  ...,  f^(x)] 
be  a factored  sequence  for  f(x),  and  assume  that  all  functions  are  once  con- 
tinuously differentiable.  Then  the  gradient  Vf(x)  ■ VfL(x)  can  be  written  as 
a sum  of  outer  product  matrices  of  the  form  (a:ai),  where  a^  is  a gradient  of 
a f actored-sequence  function  and  the  scalar  a is  composed  of  a product  of 

f actored-sequence  functions  and  first  derivatives  of  the  single-variable 

■ 0 0 

transformations,  Tj.,  used  in  the  factored  sequence. 

Proof . See  Jackson  (1983). 

The  result  given  in  Theorem  2 below  is  a formalization  of  a result  which 
appeared  without  proof  in  McCormick  (1983). 


10 


Theorem  2.  (Dyadic  Hessians) 

Let  f ( x ) be  a factorable  function  in  Rn,  let  [fi(x),  f2(x),  ...»  fL(*)] 

be  a factored  sequence  for  f(x),  and  assume  that  all  functions  are  twice  con- 
tinuously differentiable.  Then  the  Hessian  V2f(x)  = V2fL(x)  Can  be  written  as 
a sum  of  outer  product  matrices  of  the  form  (a:aia2),  where  ai  and  a2  are 
gradients  of  factored— sequence  functions  and  the  scalar  a is  composed  of  a 
product  of  f actored-sequence  functions  and  first  and  second  derivatives  of  the 
single-variable  transformations,  Ti,  used  in  the  factored  sequence. 

Proof . See  Jackson  (1983). 

Although  the  proofs  of  Theorems  1 and  2 are  not  included,  the  monadic  and 
dyadic  structure  of  the  gradient  and  Hessian  of  a factorable  function  are 
exhibited  by  displaying  the  gradients  and  Hessians  of  the  forms  in  (1.2)  as  in 
Tables  1 and  2. 

In  order  to  clarify  these  concepts,  consider  again  the  illustrative 
function  in  (1.3).  Table  3 is  a display  of  the  gradient  and  Hessian  of  this 
function.  The  entries  in  each  column  are  the  summands  In  the  expressions  for 
the  gradient  and  Hessian.  For  example, 

Vf  » [sin[bTx]  ] [expfcTx]  ]a  + [aTx]  [cos  [bTx]  ] [exp [cTx]  ]b 
+ [aTx]  [sin[bTx]  ] [exp  [cTx]  ] c, 

or,  in  polyadic  notation 

Vf  » ([sin[bTx]  ] [exp[cTx]  ] :a)  + ( [a^x]  [ cos  [bTx]  ] [exp  [cTx]  ] :b) 

+ ( [a^x]  [sintb^x]  ] [exp  [c^x]  ] :c) . 

The  table  also  Illustrates  the  lef t-to-right , tree-like  structure  of  the  deri- 
vatives involved.  From  the  table  it  can  be  seen  that  both  the  gradient  and 
Hessian  naturally  have  the  polyadic  structure  discussed  above.  Notice  too 
that  the  vectors  in  the  monads  and  dyads  are  drawn  from  the  set  {a,b,c},  each 

of  which  is  the  gradient  of  a factored  sequence  function  in  (1.4). 
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TABLE  1 


GRADIENTS  OF  FACTORABLE  FUNCTION  FORMS* 


Rule 

fi 

Vfi 

I 

~ xl 

®I 

2a 

fj(i)  + fk(i) 

Vfj(i)  + Vfk(i) 

2b 

fj(i)  * fk(i) 

Vfk(i)fj(i)  + 7fj(i)fk(i) 

2c 

TllfjU)] 

7fj(i)Ti(fj(i)j 

TABLE  2 

HESSIANS  OF  FACTORABLE  FUNCTION  FORMS* 


Rule 

fi 

1 

xi 

®nxn 

2a 

fj(i)  + fk(i) 

2 2 
7 fj(i)  + 7 fk(i) 

2b 

fj(i)  * fk(i) 

2 T 

fj(i)7  fk(i)  + 7fk(i)7fj(i) 

2 T 

+ fk(i)7  fj(i)  + 7fj(i)7fk(i) 

2c 

• 2 • • T 

Ti[fj(i)J7  fj(i)  + 7fj(i)Ti[f j(i)]Vfj(i) 

*Notation:  T[f]  * 3T/3f,  T [f  J * 32T/3f2,  and  =»  the  unit  vector  in  Rn. 
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TABLE  3 


MONADIC  AND  DYADIC  TERMS  IN  GRADIENT 
AND  HESSIAN  OF  ILLUSTRATIVE  FUNCTION* 


f 

Vf 

(summands ) 

V2f 

(summands ) 

(sin[bTx}exp[cTx]  :a) 

(cos  [bTx]exp[cTx]  :ab) 
(sin[bTx]exp  [cTx]  :ac) 

(cos  [bTx]exp  [cTx]  :ba) 

aTxsin[bTx]exp  [c^x] 

(aTxcos  [bTx]exp  [cTx]  :b) 

(-aTxsin[bTx]exp  [cTx]  : bb) 

(aTxcos  [bTx]exp  [cTx] : be) 

(sin[bTx]exp [cTx]  :ca) 

(a^xsintb^xjexp [c^x]  :c) 

(aTxcos  [bTx]exp[cTx]  :cb) 

(a^xsin[bTx]exp  [cTx]  : cc) 

*N «B » Since  the  meaning  is  clear,  we  have  dropped  a level  of  parentheses  in 
these  expressions  to  be  able  to  fit  the  table  on  one  page.  The  same  comment 
holds  for  Table  4. 
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2.3  The  N^-Order  Case 


The  next  theorem  is  fundamental  to  operations  with  higher  derivatives  of 
factorable  functions,  and  provides  the  necessary  tool  to  use  in  proving  that 
a.11  tensors  of  factorable  functions  are  polyadics  in  gradients  of  factored- 
sequence  functions . 

Theorem  3.  (Differentiation  of  Polyads) 

Let  f(x)  be  a factorable  function  in  Rn,  let  [f^Cx),  f2(x),  ...»  fL(x)] 
be  its  factored  sequence,  and  assume  that  all  functions  are  members  of  CN+1 • 
Consider  the  N-ad 

( ci . a j 32  ...  afj ) , 

where  the  aj  are  gradients  of  f actored-sequence  functions  and  a is  a scalar 
composed  in  general  of  a product  of  factored-sequence  functions  and  deriva- 
tives (no  higher  than  order  N)  of  the  single -variable  transformations  used  in 
forming  the  factored  sequence.  The  gradient  of  this  N-ad  is  the  sum  of  (N+l) 
ads,  each  term  of  which  has  the  structure  defined  above. 

Proof . Because  the  proof  of  this  theorem  illustrates  the  technique  of  polya- 
dic  differentiation,  its  salient  features  are  included  here.  For  convenience 
write  fi  for  fi(x).  Then  the  scalar  a is  of  the  form 

a - H n£, 
l 

where 

Hi  - frU)  ££  3k(Z){Ts(Jt)[fr(i)]}/3f^j, 

for  values  of  r(Jt),  s(Z),  and  k(Jl)  which  respectively  define,  for  the 
factor  of  a,  the  factored-sequence  function,  the  transformation,  and  the 
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level  of  the  derivative  of  the  transformation  used.  Note  that  k(£)  < N. 
By  the  chain  rule  of  differentiation. 


where 


and 


H 


or 


Va  - l He i, 
l 

■ 9k(«)+1{Ts(i)(£ra)1}/3£W«+1> 

Cjt  - Vfr(jt). 


Furthermore,  since  the  aj  in  the  N-ad  are  gradients  of  f actored-sequence 
functions,  Vaj  can  be  written  by  the  previous  theorem  as  the  sum  of  outer 
product  matrices  of  order  2 in  gradients  of  f actored-sequence  functions  as 
follows: 


T 

\ bp3pqbq  = \ (3pq*bpbq), 

(p,q)elj  (p,q)elj 


where  Ij  » {(p,q)  | bp3pqbq  is  a term  in  the  dyadic  representation  of  Vaj } 
and  Spq  » 3qp,  is  of  the  same  form  as  a.  Then,  by  straightforward  term-by- 
term  differention  and  collection  of  terms  (see  Jackson  and  McCormick  (1984)), 
it  can  be  shown  that  the  gradient  of  the  N-ad  is 


v(a:ala2  ***  aN>  * l (Yz:ai  ***  aNci) 

l 

+ l (a8pq:bp  •••  aNbq) 
(p,q)eli 


+ I (a8pq:al  ***  aN— lbpbq)* 

(p.q)elN 

which  is  a sum  of  (N+l)— ads  in  gradients  of  factored  sequence  functions  with 
leading  scalars  of  the  proper  form,  and  the  result  is  proven. 
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Now  the  main  theorem  on  the  structure  of  high-order  tensors  of  factorable 
functions  can  be  given. 

Theorem  4.  (Polyadic  Tensors) 

Let  f(x)  be  a factorable  function  in  Rn,  let  [f^(x),  f2(x),  fL(x)] 

be  a factored  sequence  for  f(x),  and  assume  that  f(x)  e CN  and  f^(x)  e CN»  for 
i*l,  . ..,  L.  Then  the  Nth-order  tensor  of  f(x)  can  be  written  as  the  sum  of 
generalized  outer  product  matrices  of  the  following  special  form.  Let 
(a:ai  •••  a^)  be  a polyad  associated  with  one  of  the  outer  product  matrices. 
Then  each  a^  is  the  gradient  of  some  function  in  the  factored  sequence,  and 
the  scalar  a is  a product  of  functions  in  the  factored  sequence  and  deriva- 
tives of  the  single-variable  transformations  used  in  defining  the  functions  in 
the  sequence.  Only  derivatives  3^T[fJ/3f^,  for  1 < k < N,  are  used. 

Proof . The  proof  is  a straightforward  application  of  Theorem  3 and  induction; 
the  reader  is  referred  to  Jackson  and  McCormick  (1984)  for  the  details. 

At  this  point  another  example  is  offered  to  help  illustrate  these  ideas. 
To  keep  matters  as  simple  as  possible,  the  same  illustrative  function  in 
(1.3),  whose  gradient  and  Hessian  were  displayed  in  Table  3,  will  be  used. 
Calculation  of  the  third-order  tensor  of  this  function  is  a therapeutic 
exercise,  but  only  the  final  result  is  given  here  in  Table  4.  The  terms  in 
the  Hessian  from  Table  3 are  also  included  in  Table  4 to  make  it  easier  to  see 
the  lef t-to-right,  tree-like  connections  among  successive  gradient  terms. 
Notice  that  the  vectors  that  make  up  each  triad  are  once  again  drawn  from  the 
set  {a,b,c}  and  that  the  scalar  multiples  in  the  triads  are  products  of 
f actored-sequence  functions  (e.g.,  a^x,  sin[bTx]  , exp[cTx])  and  first  and 
second  derivatives  of  the  transformations  (e.g.,  cosfbTx],  -sin[bTx]).  Notice 
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TABLE  4 


DYADIC  AND  TRIADIC  TERMS  IN  HESSIAN  AND  THIRD 
ORDER  TENSOR  OF  ILLUSTRATIVE  FUNCTION 


V2f 

(summands ) 

V3f 

(summands ) 

(cos  [b^xjexp  [c^x]  :ab) 

(-sin[bTx]exp  [cTx]  :‘abb) 
(cos  [b^xjexp  [c^x]  :abc) 

(sin[bTx]exp [cTx] : ac) 

(cos  [bTx]exp  [cTx]  :acb) 
(sin[bTx  jexp  [cTx] : acc) 

(cos  [bTx]exp[cTx]  :ba) 

(-sin[bTx]exp[cTx]  :bab) 
(cos  [bTxjexp [cTx]  :bac) 

(-aTxsin[bTx]exp  [cTx]  :bb) 

(-sin[bTx]exp[cTx]  :bba) 
(-a^xcos  [bTxjexp  [c^x]  :bbb) 
(-aTxsin[bTx]exp[cTx]  :bbc) 

(aTxcos  [bTx]exp[cTx]  :bc) 

(cos  [bTx]exp  [cTx]  :bca) 
(-aTxsin[bTx]exp[cTx]  :bcb) 
(-aTxcos  [bTx]exp[cTx]  :bcc) 

(sin[bTx]exp[cTx]  :ca) 

(cos  [bTx]exp  [cTx]  :cab) 
(sin  [bTx  jexp  [cTx]  :cac) 

(aTxcos  [bTx]exp(cTx]  :cb) 

(cos  [bTx]exp  [cTx ] :cba) 
(-aTxsin[bTx]exp  [cTx]  :cbb) 
(-aTxcos  [bTx]  exp  [c^x]  : cbc) 

(a^xsin[bTx]exp [c^x] : cc) 

(sin[bTx]exp  [cTx]  : cca) 
(-aTxcos  [bTx  jexp  [c^x]  :ccb) 
(a^xcos  [bTx]exp  [cTx]  : ccc) 
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too  that  the  only  new  calculations  needed  to  form  V^f  are  the  third  deriva- 
tives of  the  single  variable  transformations:  sin,  cos,  and  exp. 

There  are  several  computational  advantages  to  be  stressed  here.  One 
advantage  is  a result  of  the  symmetry  of  partial  derivatives,  i.e., 

/Sx^Sxj  3x^  =*  3-^f /3xj  3xk3x£,  etc.  Because  of  this  symmetry,  every  triad  in 
the  V3f  column  in  Table  4 whose  vector  factors  are  the  same,  disregarding 
order,  has  the  same  associated  scalar.  For  example,  abb,  bba,  and  bab,  all 
have  the  same  associated  scalar:  [-sin[  b^-x]  ] [exp  [c^x]  ] . Thus  only  six 

distinct  scalars  need  to  be  computed  to  form  V^f.  Moreover,  because  of  what 
may  be  called  the  "persistence  property"  of  the  derivatives  of  sin  and  exp  in 
this  example  , the  calculation  of  these  six  scalars  is  also  simplified.  That 
is,  all  derivatives  of  exp  are  equal  and  alternating  derivatives  of  sin  are 
equivalent  except  for  their  signs.  This  too  can  be  exploited  to  reduce 
computational  effort.  Finally,  since  the  vector  factors  in  the  triads  are 
members  of  the  same  set  from  which  the  dyads  in  the  Hessian  and  the  monads  in 
the  gradient  are  formed,  a computer  code  that  calculates  these  need  only  store 
a,  b,  and  c,  and  a set  of  pointers  for  each  monad,  dyad,  and  triad  indicating 
which  vector  is  required  for  each  position.  Of  course,  these  pointers  can  be 
packed  to  save  storage. 

The  current  versions  of  the  factorable  programming  input  routines 
(FACSUMT  and  FACT  IN)  take  advantage  of  each  of  the  above  in  computing  the 
gradients  and  Hessians.  Work  is  underway  to  extend  these  codes  to  capitalize 
on  the  polyadic  structure  of  factorable  function  tensors  and  compute  high- 
order  derivatives  in  the  efficient  manner  discussed  above.  See  Jackson  and 
McCormick  (1984)  for  more  on  this  current  work. 
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3.  SECOND-ORDER  SENSITIVITY  ANALYSIS  IN  NONLINEAR  PROGRAMMING 


3.1  Basic  First-Order  Sensitivity  Results 
One  application  of  the  results  in  Section  2 is  in  obtaining  high-order 
sensitivity  information  for  nonlinear  programming  problems,  although  only  the 
second-order  case  is  considered  in  this  paper.  Sensitivity  analysis  in 
nonlinear  programming  is  concerned  with  analyzing  the  behavior  of  a local 
solution  when  the  problem  functions  are  perturbed  slightly.  This  perturbation 
might  be  due  to-  an  inexactness  with  which  certain  parameter  values  in  the 
problem  are  calculated  or  because  the  optimization  model  was  parameterized  and 
one  is  interested  in  the  solution  for  a variety  of  values  of  the  parameters. 
For  additional  information  on  this  topic,  see  Armacost  and  Fiacco  (1974), 
Armacost  and  Fiacco  (1978),  Fiacco  (1980)  and  especially  Fiacco  (1983)  and  its 
excellent  bibliography.  The  parametric  problem  is  written 

P(e):  minimize  f(x,e), 

x e Rn  (3.1) 

subject  to  gi(x,e)  > 0, 

for  i»l,  ...»  m,  where  e is  an  (r  x .1)  vector  of  parameters.  The  more  general 
version  of  the  sensitivity  problem  includes  equality  constraints,  but  these 
are  not  included  here  for  simplicity.  The  ideas  and  results  presented  in  this 
section  generalize  readily  to  the  equality-constrained  problem. 

The  essence  of  sensitivity  analysis  in  nonlinear  programming  Is  the 
application  of  the  Implicit  Function  Theorem  (see,  e.g.  Bliss  (1946)  to  the 
Karush-Kuhn-Tucker  (KKT)  necessary  conditions  for  the  problem,  P(e),  in  (3.1). 
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First,  define  the  Lagrangian  for  P(e)  as 


m 

L(x,u,e)  = f(x,e)  - l uigi(x,e). 

i-1 

Then,  assuming  continuous  differentiability  in  x of  the  problem  functions,  the 
KKT  conditions  for  P(e)  are  that  there  exists  a feasible  point,  x,  for  (3.1) 
and  associated  vector  of  Lagrange  multipliers,  u,  such  that 


VL(x,u,e)  - 0, 

uiSi(x»e ) * (3.2) 


ui  > 0, 

for  i = 1 , . . . , m. 

The  statement  of  the  first-order  sensitivity  results  given  in  Theorem  6 
below  also  requires  that  the  second-order  sufficient  conditions  (SOSC)  hold 
at  a particular  solution  x,  of  P(e)  (which  is  just  (3.1)  for  a specified 
vector  of  parameter  values,  e).  These  conditions  may  be  written  as  follows. 


Theorem  5.  (SOSC) 

Let  x be  a feasible  point  for  P(e)  and  assume  that  the  functions  of 
P(e)  are  twice-continuously  differentiable  in  x in  a neighborhood  of  x.  Let 
(x,u,e)  be  a triple  that  satisfies  the  KKT  conditions  in  (3.2)  and  define 


B - (i  | gi(x,e)  - 0}, 

and 

D * {i  | ui  > 0} . 

Further,  suppose  that 

d^  V^L(x,u,e  )d  > 0, 

for  all  d * 0,  such  that 

T A A 

d Vg^(x,e)  > o,  for  all  i e B, 

and 

T 

d ^gi(x,e)  * 0,  for  all  i e D. 

Then  x is  a strict  local  minimizer  for  P(e). 
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The  earliest  known  reference  to  these  conditions  is  Pennisi  (1953), 
although  it  was  almost  15  years  (see  McCormick  (1967),  and  Fiacco  and 
McCormick  (1968))  before  they  were  more  fully  developed  and  exploited. 

The  following  theorem  can  be  viewed  as  the  basic  result  in  nonlinear 
programming  sensitivity  analysis. 

Theorem  6.  (First-Order  Sensitivity) 

Let  x be  a feasible  point  for  P(e)  and  assume  that 

i.)  the  functions  in  (3.1)  are  twice -continuously  differentiable  in  x and 
the  cross-partial  derivatives  exist  and  are  jointly  continuous  in 
x and  e in  a neighborhood  of  (x,e); 

ii.)  the  second-order  sufficient  conditions  (Theorem  5)  for  a local 

A A A 

minimum  of  P(e)  hold  at  x,  with  associated  Lagrange  multipliers  u, 

iii.)  the  gradients  of  the  binding  constraints,  i.e.,  Vgi(x,e),  i e B,  are 
linearly  independent ; and 

iv.)  u^  > 0 for  all  i e B. 

Then,  for  e in  asuf f iciently  small  neighborhood  of  e,  there  exists  a unique, 
once-continuously  differentiable  vector  function 

y(e)  » [x(e)T,  u(e)T]T} 

satisfying  the  KKT  conditions  for  P(e),  with 

y(e)  - [x(e)T,  u(e)T]T> 

such  that  x(e)  is  a locally  unique  isolated  minimizer  for  P(e).  Furthermore, 
the  first  partial  derivatives  of  x(e)  and  u(e)  with  respect  to  e can  be 
obtained  from  the  equation 

Y - -M-lN,  (3.3) 
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where 


3xi  3xi 

■ • • • ' — ■ ■ 

3 e i 3er 


• • 
• 

• 0 
0 0 

• 

3xn 

• • 

3xn 

3e^ 

3er 

3ui 

3ui 

3ei 

• 

3e 

m 0 

0 0 

• 

3um 

0 0 

3um 

3ei 

3er 

2 

- 

- 

2 
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T 

“Vgl 
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» • “ 7gm 

7exL 

T 
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T 

gl 
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T 

U27g2 

0 
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. . 0 
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• 

• 

• 

• 

• 

• • 

• 

• 

T 

• 

• 

• • 

• 

T 

0 

0 . 

• • gm 

with  all  quantities  in  M and  N evaluated  at  (x  , u ) . 
Proof  . See  Fiacco  (1983). 
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3.2  Development  of  the  Second-Order  Equation 


The  result  in  (3.3)  provides  a direct  way  of  calculating  first-order 
sensitivity  information  and  is  the  point  of  departure  for  the  work  given  here, 
which  begins  with  the  following  theorem  due  to  Fiacco  (1983). 

Theorem  7 . (Higher-Order  Sensitivity) 

Let  x be  a feasible  point  for  P(e)  and  assume  conditions  (i)  through 
(iv)  in  Theorem  6.  Assume  also  that  all  (k+l)st-order  partial  derivatives  in 
x and  all  (k+l)st-order  cross  partial  derivatives  in  x and  e exist  and  are 

A A 

jointly  continuous  in  x and  e in  a neighborhood  of  (x,e).  Then  in  a 
sufficiently  small  neighborhood  of  e,  the  vector  function  y(e)  is  k times 
continuously  differentiable. 

Proof . The  proof  follows  directly  from  the  fact  that  if  the  Jacobian,  M,  of 
the  KKT  conditions  in  (3.2)  is  nonsingular,  and  if  the  functions  involved 
possess  the  appropriate  degree  of  differentiability,  the  Implicit  Function 
Theorem  (see  Bliss  (1946),  p.  270)  guarantees  the  existence  of  the  higher- 
order  partial  derivatives.  Nonsingularity  of  M for  (3.2)  was  shown  in  Fiacco 
and  McCormick  (1968).  Thus  the  theorem  is  proved  by  direct  application  of  the 
Implicit  Function  Theorem. 

If  this  high-order  sensitivity  information  is  to  be  used,  it  is  necessary 
also  to  develop  a convenient  mechanism  for.  calculating  it.  This  is  done  next 
for  the  case  when  k = 2.  Consider  the  matrix  equation  in  (3.3)  and  rewrite 
it  as 

MY  - -N. 

The  (i,j)c^  element  of  this  matricial  equation  is 

I ^is  ^s j * “Nij , 
s 
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and  the  total  derivative  of  this  equation  is  obtained  using  the  product  rule 
as  follows: 


or 


r , 3 Vi  ‘Mis  , dN1;1 

l UK1  + de“  1 - - 3TT1  • 


3ek 


3Y 


r WJ-si  r ^ 

l Mis  “ l 7 


dMis 


dNij 


3ek 


de^  SJ  de^ 


(3.4) 


Next,  the  chain  rule  is  required  in  finding  the  right-hand-side  (rhs)  in  (3.4) 
since  each  element  of  the  M and  N matrices  is  a function  of  each  element  of  x, 
u,  and  e.  Hence, 


dMis  a y 3Mis  3yt  + 3Mis 

dek  * t 3yt  3ek  3ek 

and 

dNij^  ^ y 3Ni^  3yt  + 

dek  * l 3Yt  3ek  + 3ek 


Using  these  and  the  fact  that  Ysj  ■ 3ys/3ej,  the  rhs  in  (3.4)  can  thus  be 
written 

i 1 1 Sr Ytk  + Sf + 1 ^ Yck  + S1  • 

Now  consider  the  left-hand-side  (lhs)  in  (3.4)  more  closely.  It  can  be 
written 

a2 

_ r M _i_ys_ 
s iS  3ek3(=j 


which  is  of  the  form 


” I ^iS  Ag-ife, 


which  in  turn  gives  the  (i,j,k)ch  element  of  the  three-dimensional  array  that 
results  from  the  matrix  multiplication  of  M and  the  kth  (counting  from  front 
to  rear)  two-dimensional  matrix  of  A that  is  parallel  to  the  xz-plane  in  r3  . 
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Observe 
(3.4)  by  H - 


3_yi 

3e  j 3e^ 


that  3^yi/3ek3£j  can  b®  obtained  by  premultiplying  both  sides  of 
M“l.  Then 


3N 


V rr  r ? ---  + li-i  + V ---  + -~i  l 

t Hir  s t 3yt  3£k  3ek  3£ j t 3yt  3£k  3£k 


(3.5) 


which  is  the  (i,j,k)th  element  of  the  three-dimensional  array  of  second 
partial  derivatives  of  y 3 [x(e)T,  u(e)T]T  with  respect  to  e. 

It  is  desired  next  to  write  (3.5)  using  array  notation.  Because  these 
operations  are  performed  in  three-space,  however,  some  new  notation  must  be 
developed  before  this  Is  rewritten.  In  order  to  motivate  the  new  notation, 
consider  a multivariable  function  f(x).  It  is  perhaps  clear  In  this  case  what 
is  meant  by  Vf,  V2f  and  V3f;  i.e.,  Vf  is  a vector  that  is  written  down  the 
page,  V2f  requires  taking  the  gradient  of  each  element  of  Vf  and  writing  that 
result  across  the  page  to  form  a two-dimensional  matrix  , and  thus  to  form  V^f 
one  would  take  the  gradient  of  each  element  of  V2f  and  write  that  result  Into 
the  third  dimension  (into  the  page,  say).  Hence  if  M is  a matrix,  it  should 

be  clear  what  Is  meant  by  VM;  i.e.,  take  the  gradient  of  each  element  of  M and 

write  the  result  into  the  third  dimension. 

But,  if  y is  a vector  in  R^,  there  are  three  possible  orientations 
parallel  to  the  coordinate  planes  In  for  the  matrix  usually  notated  as  Vy. 

These  are  shown*  in  Figure  1.  In  order  clearly  to  differentiate  among  these, 
the  notation  will  be  used.  Thus  V{»}  operates  on  the  argument  by 

taking  Its  gradient  into  the  third  dimension.  Note  that 

Vy  * V{  y } * V{yT}. 

Whereas  the  elements  of  these  vectors  are  the  same,  their  orientations  in 

*A11  graphs  were  produced  using  DATAPLOT  as  described  in  Filliben  (1981). 
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FIGURE  1 

POSSIBLE  ORIENTATIONS  OF  A MATRIX  IN  THREE-SPACE 


three-space  are  not.  These,  too,  are  shown  in  Figure  1.  The  same  comments 
apply  with  regard  to  total  differentiation,  for  which  the  notation  d{  •}  will 
be  used.  This  notation,  of  course,  is  only  used  for  an  operation  from  R^  to 
r3;  analogs  exist  for  higher  dimensions. 

Now  with  this  notation,  the  matrix  form  of  (3.5)  is  written: 

2 

Veey  - -M-1[VyM  Ve{y}  Vey  + VeM  Vey  + VyN  Ve{y}  + VeN] , (3.6) 


where,  letting  p =»  (n  + m), 

2 

Veey  is  (p  x r x r) 

M-1  is  (p  x p x 1) 

VyM  is  (p  x p x p) 

Ve{y}  is  (p  x 1 x r) 


Vey  is  (p  x r x 1) 
^eM  is  (p  x p x r) 
VyN  is(pxrxp) 
VeN  is  (p  x r x r) 


and  the  multiplication  in  three-space  is  carried  out  so  that  each  array  within 

the  brackets  on  the  rhs  of  (3.6)  is  (p  x r x r).  This  (p  x r x r)  array  must 
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then  be  pre  mu  It ip lied  by  M"1,  a (p  x p x 1)  array.  This  mult iplication  is 

effected  by  premultiplying  each  of  the  r matrices  of  dimension  (p  x r)  by 

M"1,  resulting  in  r matrices  of  dimension  (p  x r),  or  a (p  x r x r)  array 

2 

again,  as  required  by  Veey.  More  on  three-dimensional  array  multiplication 
(including  graphical  depictions)  is  given  in  Section  3.5. 

Observe  that  (3.3)  could  have  been  differentiated  directly  using  the 
techniques  for  differentiation  of  matrices  as  given  in  Marlow  (1978).  He  uses 
Kronecker  products  and  a special  two-dimensional  representation  of  three- 
dimensional  arrays  for  performing  this  kind  of  differentiation.  A disad- 
vantage of  this  approach  is  that  whatever  special  structure  that  exists  in  the 
three-dimensional  arrays  is  lost  in  the  process.  This  "structural  integrity" 
is  maintained  in  the  approach  given  here,  allowing  the  more  detailed  analysis 
given  in  the  next  section. 

There  has  been  some  previous  work  in  second-order  sensitivity  analysis. 
Dembo  (1982)  derived  a computationally  efficient  technique  for  getting  an 
approximation  to  the  second  derivative  with  respect  to  e of  the  vector  func- 
tion y,  correct  to  terms  of  order  two.  By  contrast,  (3.6)  is  exact.  Also,  a 
result  for  the  geometric  programming  problem  for  the  case  where  e is  a scalar 
was  provided  by  Kyparisis  (1983). 

The  material  in  the  remainder  of  this  section  addresses  the  issue  of 
computational  efficiency  when  the  problem  functions  are  factorable.  Sections 
3.3  and  3.4  are  admittedly  rather  detailed.  The  reason  for  including  them  is 
twofold:  to  illustrate  the  special  polyadic  structure  of  the  tensors  in 

nonlinear  programming  sensitivity  analysis,  and  to  stimulate  more  research 
in  these  areas.  It  is  not  necessary  to  wade  through  this  material  for  each 
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second-order  sensitivity  calculation.  Ultimately  this  will  be  performed 
automatically  by  the  Factorable  Programming  system  of  programs  being  developed 
at  the  National  Bureau  of  Standards. 


3.3  Structure  of  the  Three-Dimensional  Arrays 
While  the  formula  in  (3.6)  may  be  mathematically  succinct,  it  may  not  be 
obvious  how  the  polyadic  structure  of  derivatives  of  factorable  functions  can 
assist  in  its  calculation.  To  understand  how  these  calculations  are  per- 
formed, it  is  necessary  to  investigate  further  the  structure  of  the  three- 
dimensional  arrays  involved.  These  are:  VyM,  VeM,  VyN,  and  VeN.  Since 

VeN  is  the  simplest  of  these.  It  is  considered  first.  Strictly  speaking,  the 
(i,j,k)c^  element  of  VeN  is 


(7eN)ijk 


3eit3e  j3xi 


ui-n  3 Si-n 

j 


i < n 


i > n. 


This  can  be  thought  of  as  taking  the  gradient  with  respect  to  £ of  each 
element  of  N into  the  third  dimension,  and  thus  can  be  pictured  as  a parti- 
tioned rectangular  parallelepiped  as  shown  in  Figure  2.  The  partitioning, 
which  is  due  to  the  parts  in  y,  separates  VeN  into  an  3 "upper”  part  which  is 
VeexL  and  a "lower”  part  which  Is  just  a stack  of  constraint  Hessians  with 
respect  to  e . This  more  detailed  structure  Is  shown  in  the  exploded  view  of 
VeN  given  in  Figure  3. 


28 


V, 


FIGURE  2 

BASIC  PARTITIONED  STRUCTURE  OF  V.N 


FIGURE  3 

EXPLODED  VIEW  OF  STRUCTURAL  DETAILS 

OF  V.N 
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The  next  simplest  array  to  portray  is  VyN.  This  is  a three-dimensional 
array  that  has  four  parts,  again  a result  of  the  two  parts  in  y,  arranged  as 


shown  in  Figure  4.  These  partitions  are  described  mathematically  as: 


i<n,  k<n 


2 

3 gi-n 


i>n,  k<n 


i<n,  k>n 


i>n , k>n , i»k 


0 


otherwise 


These  are  depicted  graphically  in  the  exploded  view  in  Figure  5,  where  the 
different  orientations  of  the  various  matrices  and  vectors  in  three-space 
are  more  easily  grasped. 

The  next  three-dimensional  array  to  consider  is  V£M.  This  too  is  par- 
titioned into  four  smaller  three-dimensional  arrays,  but  with  an  orientation 
that  differs  from  The  basic  structure  of  the  parts  is  shown  in  Figure  6, 

and  each  element  of  the  array  is  defined  in  the  following  equations: 


3 

3 L 


i<n,  j <n 


3e]c3xj3xi 


i<n,  j>n 


(VeM)ijk 


u 


i>n,  j <n 


dgj-n 

3ek 


i>n,  j>n , i-j 


0 


otherwise 
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FIGURE  4 

BASIC  PARTITIONED  STRUCTURE  OF  V,N 


-VUi 


Vgi 


EXPLODED  VIEW  OF  STRUCTURAL  DETAILS 

OF  VyN 
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The  three-dimensional  orientation  of  these  matrices  and  vectors  is  shown  in 


more  detail  in  Figure  7. 

The  last  three-dimensional  array  to  be  depicted  is  VyM,  the  most  compli- 
cated, with  eight  parts  that  result  from  differentiating  three  times  with 
respect  to  the  partitioned  vector  y.  The  eight  parts  are  shown  in  Figure  8, 
and  the  mathematical  statement  of  the  (i,j,k)c^  element  for  each  case  is  given 
below. 
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i<n,  j <n, 

k<n 

3xk3xj  3xi 

9 

— 

a2 

3 8.1 -n 
3xk3xt 

9 

i<n,  j>n. 

k<n 

( VyM  ) i j k.  “ 

ui-n 

a2 

3 g i-n 
3xfc3xj 

9 

i>n,  j<n, 

k<n 

3gi-n 

3xk 

9 

i>n , j >n , 

k<n,  i=*j 

a2 

3 gk-n 

i<n,  j <n. 

k>n 

3xj3x£ 

9 

0 

9 

i<n,  j>n. 

k>n 

3gi-n 

3xj 

9 

i>n,  j <n. 

k>n , i=k 

0 

9 

i>n,  j>n. 

k>n 

— 

0 

9 

otherwise 

• 

these  structures  are 

shown  in 

detail 

in  Figure 

9. 
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V. 


FIGURE  6 

BASIC  PARTITIONED  STRUCTURE  OF  V,M 


FIGURE  7 

EXPLODED  VIEW  OF  STRUCTURAL  DETAILS 

OF  V,M 
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BASIC  PARTITIONED  STRUCTURE  OF  V,M 


FIGURE  9 

EXPLODED  VIEW  OF  STRUCTURAL  DETAILS 

OF  V,M 
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3.4  Polyadics  In  Second-Order  Sensitivity  Analysis 


3.4.1  The  Dyadics  In  the  Second-Order  Terms 
Although  the  material  in  the  previous  section  provided  insight  into  the 
three-dimensional'  structure  of  the  second-order  sensitivity  analysis  formula 
in  (3.6),  it  still  may  not  be  clear  how  the  natural  polyadic  structure  of 
tensors  of  factorable  functions  can  assist  in  the  evaluation  of  the  formula. 

To  see  this,  it  is  first  necessary  to  show  that  each  substructure  exhibited  in 
Figures  3,  5,  7,  and  9,  is  a polyadic,  and  then  to  demonstrate  how  the  multi- 
plication with  Vey  (or  Ve { y } ) is  to  be  carried  out.  This  section  addresses  the 
former  of  these  activities  by  providing  proofs  that  factorable  functions  of 
(x,e)  have  polyadic  derivatives.  Section  3.5  addresses  the  latter. 

First  notice  that  if  the  functions  of  the  problem  given  in  (3.1)  are 

factorable  in  x and  e,  so  too  is  the  Lagrangian  of  that  problem  factorable  in 

x and  e.  Then  the  task  of  this  section  reduces  to  considering  some  function 
f(x,e)  - fL(x,e),  with  factored  sequence  [fi(x,e),  f2(x,e),  ...,  fL(x,e)] 
formed  using  the  following  modification  to  the  rules  given  in  (1.2). 

Rule  1 . For  i < n, 

fi(x,e)  - x£. 

Rule  2.  For  i > n,  either 

a. )  f i(x,e)  » f j (i)(x,e ) + fk(i)(x,e);  or  (3.7) 

b. )  f i(x, e)  » fj(i)(x,e)  • fk(i)(x,e);  or 

c. )  f i(x,e)  - Ti[f  j(i)(x,e),e]; 

where  j(i)  < i,  and  k(i)  < i,  and  the  derivatives  with  respect  to  e of  the  T< 
are  themselves  factorable  functions  in  e.  For  convenience,  in  the  rest  of 
this  section,  let  j(i)  * j , k(i)  *■  k,  and  drop  the  arguments  (x,e)  and 
tf  j(i)(x,e)l  • Thus,  e.g.,  Ti[f  j(i)(x,e),e]  -►  Ti# 
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The  gradients  with  respect  to  x of  the  forms  in  (3.7)  are  given  in 
Table  5.  Calculation  of  the  Hessians  with  respect  to  x of  these  forms  is 
straightforward  and  the  result  is  given  in  Table  6.  Notice  that  there  is  no 
difference  between  Table  6 and  Table  2.  Therefore  Theorem  2 applies  here  and 
it  is  possible  to  write  V2f(x,e)  as  a sum  of  dyads  of  the  appropriate  form. 

The  calculation  of  the  matrix  of  second  partial  derivatives  of  f±(x,e) 
with  respect  to  x and  e is  slightly  more  complicated  by  the  fact  that  T^  is  a 
composite  function  of  f(x,e)  and  e.  This  requires  the  chain  rule  to  obtain 
the  second  derivative  matrix.  Hence, 

2 • • T .T 

DexTi  = DelTiVfj]  - Ti7exfj  + Vf-jT^f-j  + 7fj7eTi  . 

With  this,  the  formulae  for  the  matrix  of  second  partials  of  f^  with  respect 

to  x and  £ can  be  written  as  in  Table  7.  The  first  appearance  of  these 

formulae  was  in  deSilva  and  McCormick  (1978). 

It  should  be  apparent  from  Table  7 that  an  inductive  argument  paralleling 

2 

that  used  in  the  proof  of  Theorem  2 would  yield  the  fact  that  Vexf  can  also 

2 

be  written  as  the  sum  of  dyads'.  The  difference  is  that  for  Vexf , the  first 

vector  in  the  dyads  is  a gradient  with  respect  to  x of  a f actored-sequence 

function,  and  the  second  vector  is  a gradient  with  respect  to  e of  a factored- 

sequence  function  or  a derivative  of  a single-variable  transformation. 

This  result  implies  that  the  submatrices  of  second-order  derivatives 

which  appear  in  the  arrays  VyM,  VeM,  VyN,  and  VeN  on  the  rhs  of  (3.6)  and  in 

Figures  3,  5,  7,  and  9,  are  all  dyadics.  What  remains  is  to  show  that  the 

subarrays  of  third-order  derivatives,  in  these  same  four  arrays,  are  triadics. 

2 2 

Those  subarrays,  arising  from  the  block  V L in  M and  the  block  VexL  in  N,  are 

3 3 3 3 3 

7 L,  VxexL,  7£xxl>  and  VeexL.  The  first  case,  V L,  is  uncomplicated  by 
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TABLE  5 


GRADIENTS  OF  FACTORABLE  FUNCTION  FORMS 
IN  SENSITIVITY  ANALYSIS 


Rule 

fi 

Vfi 

1 

xi 

el 

2a 

fj  + fk 

Vf j + Vfk 

2b 

fj  * fk 

£j7£k.  + £k7£j 

2c 

Ti 

Ti7£j 

TABLE  6 

HESSIANS  OF  FACTORABLE  FUNCTION  FORMS 
IN  SENSITIVITY  ANALYSIS 


Rule 

fi 

2 

V fi 

1 

Xi 

°Tixn 

2a 

f j + fk 

2 2 
V fj  + V fk 

2b 

f j ’ fk 

2 T 2 T 

fjV  fk  + VffcVfj  + fk7  fj  + VfjVfk 

0 2c 

Ti 

• 2 . • T 

TiV  fj  + VfjTiVfj 
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TABLE  7 


HESSIANS  WITH  RESPECT  TO  x AND  e OF  FACTORABLE 
FUNCTION  FORMS  IN  SENSITIVITY  ANALYSIS 


Rule 

fi 

2 

7exf  i 

1 

xi 

Onxr 

2a 

f j + fk 

2 2 
7exf j + 7exfk 

2b 

f j * fk 

2 T 2 T 

fj7exfk  + vfk7efj  + fk7exfj  + vfjvefk 

2c 

Ti[fj,e] 

2 t .T 

iiWj  + Wjii’gfj  * Wj’cTi 

derivatives  with  respect  to  e,  and  thus  is  a triadic  by  Theorem  4.  The  proofs 
for  the  other  cases  are  in  the  same  vein  as  the  proof  of  Theorem  2 and  require 
that  the  formulae  for  the  third  derivatives  with  respect  to  x and  e respec- 
tively be  derived  for  the  forms  in  Tables  6 and  7.  These  are  developed  next. 

3.4.2  The  Triadic  Form  of  V^xxf 

3 

The  first  step  in  showing  that  Vexxf  is  triadic  is  to  apply  the  operator 
De{*}  to  each  factorable  function  form  in  Table  6.  This  is  straightforward 
for  cases  1 and  2a.  It  Is  also  straightforward  for  2b,  but  since  this  case 
represents  the  first  use  of  the  new  notation,  it  is  developed  below. 

r 2 T 2 T,  3 2 T 

DelfjV  fk  + VffcVfj  + fkV  fj  + VfjVfk}  - fjVexxfk  + V fkVe{  f j } +VfkVe{Vfj} 

T 3 2 

+ 7fjVe{Vfk}  + fkVexxfj  + V fjVe{fk| 

+ Vf  jVe{  Vf^}  + VfkVe{Vfj}. 
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The  n»re  complicated  case  is  2c,  which  is  a result  of  the  fact  that,  as 

• • • 

noted  earlier,  each  T*  (and  hence  T*  and  Ti)  is  a composite  function  of  f(x,e) 
and  e,  requiring  the  chain  rule  to  calculate  the  total  derivative.  This  is 
shown  below. 


2 T 3 2 # 

DE{Ti.V  fj  + VfjiiVfj}  -TiVexxfj  + V fjDslTtl  + MvfjjTiVfj 


T • • T 

+ TfjDejiilVfj  + WjTivs[7fj}. 


(3.8) 


However,  using  the  chain  rule, 


De(fi}  - + Ve{ii}, 

and  (3.9) 

- TlVe{fj}  + 7e{ii}. 

After  using  (3.9),  the  rhs  of  (3.8)  becomes: 

3 . . 2 2 

TiVexxfj  +T"i7  fjMfj}  + 7 fjVe{Ti} 

T T 

+ Ve(Vfj  }TiVf  j + Vfj'TiVelfjJVfj 

+ Vf  j Vc{iI}Vf  j + Vf  jTiVe{  Vf  j } . 

3 

These  third-derivative  formulae  for  Vexxf  are  collected  in  Table  8.  The 
3 

proof  that  Vexxf  is  triadic  is  a straightforward  application  of  induction  to 

the  factored  sequence  and  mimics  the  argument  used  in  the  proof  of  Theorem  2. 

3 

The  result  is  that  Vexxf  0311  be  written  as  the  sum  of  triads  of  the  form 


(a:aia2a3) , 

where 

a\  is  (n  x 1) , 

a2  is  (n  x 1) , 

a3  is  (r  x 1), 
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TABLE  8 


THIRD-ORDER  TENSORS  WITH  RESPECT  TO  x,  x,  AND  e OF  FACTORABLE 
FUNCTION  FORMS  IN  SENSITIVITY  ANALYSIS 


Rule 

fi 

3 

7exxf i 

1 

Xi 

Onxnxr 

2a 

fj  + fk 

3 3 

7exxf j + 7exx^k 

2b 

fi  • f j 

3 2 T 

fjvexxfk  + 7 fk7e{fj}  + 7fk7e{7fj} 

T 3 2 

+ Vfj7e{vfk}  + fk7exxfj  + 7 fj7e{fk} 

+ Vfk7e{Vf£l  + Vfk7e{7fj} 

2c 

Tilfj.e] 

.3  2 2 

TiVexxfj  +T*iV  fjVeJfj}  + V fjVe{Ti} 

T T 

+ 7£(7fJ|Ti7£j  + 7fj  Tt7e(fj}7fj 

T • • T 

+ 7fj7e{Ti}Vfj  + Vf  jT i7e{  Vf  j } 

and  a is  a product  of  factored  sequence  functions  and  first,  second,  and  third 
derivatives  of  the  single-variable  transformations  used  in  forming  the  fac- 
tored sequence.  Also,  the  vector  factors  ai  and  a2  are  gradients  with  respect 
to  x of  a factored  sequence  function,  and  ay  is  a gradient  with  respect  to  e 
of  a f actored-sequence  function  or  of  a first  or  second  derivative  with 
respect  to  f of  one  of  the  single -variable  transformations  in  the  factored 
sequence . 
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3 

3.4.3  The  Triadic  Form  of  Vx£xf 

Just  as  in  the  previous  section,  the  key  step  in  the  proof  here  is  to 
compute  the  derivatives  with  respect  to  x of  the  forms  in  Table  7.  Again  this 
process  is  straightforward,  if  tedious,  for  cases  1,  2a,  and  2b,  but  case  2c 
presents  some  complications,  and  is  therefore  developed  in  detail.  Thus, 

• 2 ..T  .T.3  2 . ..T 

D{TiWj  + Vf-jTiVefj  + 7fjVeTil  -TiVxexfj  + ’exfjDfii}  + V{  Vf  j jT^tf  j 

+ VfjDfiilVgfj  + Vf jTiV{ Vef jT } (3.10) 

•T  .T 

+ VfjD{VeTi}  + VeTiV{Vfj}. 

But,  using  the  chain  rule, 

D(Ti}  -T^fj} 

d{T±}  - *T lv{ f j } (3.11) 

•T  • #T 

D{Vef  i}  - VeTi  V{  f j } - 

Substituting  (3.11)  into  (3.10)  yields  the  final  form  given  in  Table  9, 

which  also  contains  the  formulae  for  the  other  cases.  Here  too  it  is  easy  to 

see  that  the  same  inductive  argument  used  in  Theorem  2 results  in  a triadic 
3 

form  for  Vxexf  made  of  triads  of  the  form 

(a:aia2a3) , 

where 

ai  is  (n  x 1), 
a2  is  (r  x 1), 
a3  is  (n  x 1) , 

and  a is  a product  of  f actored-sequence  functions  and  first,  second,  and  third 
derivatives  of  the  single-variable  transformations.  In  this  case,  ai  and  a3 
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TABLE  9 


THIRD-ORDER  TENSORS  WITH  RESPECT  TO  x,  e,  AND  x OF  FACTORABLE 
FUNCTION  FORMS  IN  SENSITIVITY  ANALYSIS 


Rule 

fi 

3 

^xexf i 

1 

Xi 

Onxrxn 

2a 

fj  + fk 

3 3 
^xexf j + ^xex^k 

2b 

fj  * fk 

3 2 T 

fj^xex^k  + ^ex^kW^j}  + k^{  j } 

T 3 2 

+ VefjV{Vfk}  + fk^xex^j  + ^ex^j^l^k} 

T T 

+ VfjVfVgfk}  + VefkVfVfj} 

2c 

Tiffj.e] 

.3  2..  ..T 

Tl’xexfj  + VexfjTiv{fj}  + 7{7fj}Ti7efj 

• • • T • • T 

+ 7fj  Ti7{fj}7tfj  + 7f jT 7ef j } 

T T 

+ 7f  j 7eT  ^ 7{  f j } + 7tT17{7fj} 

are  gradients  with  respect  to  x of  f actored-sequence  functions  and  a2  is  the 
• gradient  with  respect  to  e of  a f actored-sequence  function  or  of  a first  or 
second  derivative  of  a Tf. 

It  is  instructive  to  note  that  the  result  in  this  section  could  also  be 

obtained  more  directly  by  utilizing  the  triadic  structure  exhibited  in  Section 
3 

3.4.2  for  Vexxf . Because  of  the  symmetric  nature  of  partial  derivatives,  that 

3 

array  and  the  one  of  this  section,  Vxexf , contain  the  same  entries,  in 

3 

different  arrangements.  In  fact,  Vxexf  can  be  obtained  from  the  former  by 

interchanging  the  first  two  vector  factors  of  each  triad  in  its  triadic 

3 

expansion,  thereby  exhibiting  the  triadic  form  for  Vxexf  immediately. 
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3 

3.4.4  The  Triadic  Form  of  V£exf 

This  is  the  last  case  to  be  considered,  and  also  the  most  difficult  of 
them.  As  in  the  previous  sections,  the  formulae  for  forms  1,  2a,  and  2b  are 
straightforward,  but  for  2c, 


2 T t .3  2 # # t 

De(il?exfj  + 7£jil7efj  + 7fj  7eTi}  * Ti.7ggxf  j + VexfjDe(Ti}  + 7g(7f  j jT^gf  j 


+ 7f jDe{Ti}7ef j + 7f jT^7e{ 7ef j } 


•T  .T 

+ 7fjDe{7eTl}  + 


• • • 

Both  De{Ti}  and  Dc {T ^ } were  calculated  In  (3«9)«  The  third  special 
computation  needed  is,  using  the  chain  rule  again, 

.T  .T 

Del^e^i}  * + ^{VgT^}. 

Substitution  gives  the  formulae  for  the  third  partials  in  Table  10. 

3 

The  inductive  argument  demonstrating  the  triadic  nature  of  Veexf  can  now 
go  through  as  before,  bearing  in  mind  the  assumption  in  (3.7)  that  the 
derivatives  with  respect  to  f of  the  single-variable  transformations,  T^,  used 
in  forming  the  factored  sequence  are  themselves  factorable  functions  in  e. 

This  is  required  since 

.T 

7e(7eTi} 


appears  in  the  last  term  of  Table  10.  Except  for  orientation  in  three-space, 

this  is  just  the  Hessian  with  respect  to  e of  the  function  T^ff j(x,e  ),e  ] . So 

0 

long  as  Ti  is  factorable  in  e , its  Hessian  is  a dyadic  in  gradients  with 

respect  to  e of  its  factored  sequence,  and  the  statements  regarding  the 

3 3 

triadic  nature  of  Veexf  go  through  as  before.  Thus  it  is  true  that  VeGxf  can 
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TABLE  10 


THIRD-ORDER  TENSORS  WITH  RESPECT  TO  x,  e,  AND  e OF  FACTORABLE 
FUNCTION  FORMS  IN  SENSITIVITY  ANALYSIS 


Rule 

fi 

3 

^eexf i 

1 

Xi 

Onxrxr 

2a 

f j + fk 

3 3 
^eexf j + ^eex^k 

2b 

f j ’ fk 

3 2 T 

fj^eex^k  + ^ex^k^effj}  + ^x^k^el^e^j} 

T 3 2 

+ VefjVe{Vxfk}  + fk^eex^j  + ^sx^ j^el^k} 

T T 

+ Vf  j Ve{Vef^}  + Vef]cVe{Vfj} 

2c 

.3  2 2 

Ti^cex^j  + ^exf  jTiVe{ f j } + VexfjVe{Ti} 

• • T • • • T 

+ 7*{  TfjjTiVefj  + »fj  Tive{*3}Vt£j 

T • • T 

+ vfj?e{Ti}vefj  + jTiVe{  Vef  j } 

• #T  «T  «T 

+ VfjVeTi  Ve{f j}  + Vf-jVe{VeTi}  + VeTiVe{Vfj} 

be  written  as  the  sum  of  triads  of  the  form 


where 

0 


(a:aia2a3) , 
a^  is  (n  x 1), 
a2  is  (r  x 1), 
33  is-  (r  x 1)  , 


and  a is  a product  of  factors  as  before,  with  additional  factors  included  from 
the  factored  sequences  for  the  T^'s.  The  last  item  of  note  is  that  here  ai  is 
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a gradient  with  respect  to  x of  a f actored-sequence  function  for  f(x),  and  ^ 
and  a3  are  gradients  with  respect  to  e of  a f actored-sequence  function  for 
f(x)  o£  a gradient  with  respect  to  e of  a derivative  of  a Tf,  0£  a gradient 
with  respect  to  e of  a f actored-sequence  function  or  transformation  for  one  of 
the  single-variable  transformations  in  the  factored  sequence  for  f(x). 

3.5  Array  Multiplication  with  Generalized  Outer  Product  Matrices 
In  the  previous  section,  the  polyadic  nature  of  the  arrays  in  (3.6)  was 
exhibited.  This  section  takes  up  the  notion  of  multiplying  these  generalized 
outer  product  matrices  by  Y =*  Vey,  which  appears  In  (3.6)  with  two  different 
three-space  orientations;  viz.,  Vey  which  is  (p  x r x 1)  and  Ve{y}  which  is 
(p  x 1 x r).  Multiplication  is  one  area  wherein  Factorable  Programming, 
through  the  natural  polyadic  nature  of  the  function  derivatives  involved, 
offers  a potentially  great  computational  saving  over  alternative  approaches. 

Consider  for  instance  the  case  of  multiplication  of  a dyad  and  a matrix, 

(a:ab)  * F * (aab^)F, 

where  a is  (n  x 1),  b is  (m  x 1),  F is  (m  x n)  and  a is  a scalar.  Of  course 
one  method  of  computing  this  is  to  form  (aa)b^  which  requires  n(m+l)  multipli- 
cations, then  to  multiply  the  result  by  F,  requiring  nm  malt ip li cat ions  and 
n(m-l)  additions.  A far  more  efficient  way,  however,  is  to  form  c = b^F, 
requiring  the  same  nm  mi  It ip li cat ions  and  n(m-l)  additions,  but  now  store  the 
result  in  the  dyadic  form  as 


(a:ac)  * aacT , 

thus  saving  the  n(m+l)  multiplications  required  to  form  the  dyad  explicitly. 
In  fact,  one  of  the  basic  tenets  of  Factorable  Programming  Is  that  certain 
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matrices  need  never  be  formed  explicitly,  since  all  required  calculations  can 
be  performed  with  the  dyadic  structures.  This  of  course  offers  a potentially 
great  computational  saving. 

It  only  needs  demonstrating,  therefore,  how  to  perform  the 
multiplications  in  (3.6).  Consider  then  multiplication  between  a generalized 
outer  product  matrix  of  order  3,  a triad,  and  a two-dimensional  matrix  in 
three-space.  Since  there  are  three  possible  orientations  for  a two- 
dimensional  matrix  in  three-space,  one  could  guess  that  there  are  six  ways  to 
perform  the  multiplication  depending  on  whether  the  matrix  is  pre-  or  post- 
multiplying  the  three  dimensional  array.  This  is  of  course  the  case,  and 
these  multiplications  are  illustrated  in  Figure  10,  where  in  each  case  the 
matrix  post-multiplies  each  similarly  oriented  matrix-slice  of  the  three- 
dimensional  array.  A similar  situation  obtains  for  pre-multiplication. 


FIGURE  10 

THREE  WAYS  TO  MULTIPLY  A THREE-DIMENSIONAL 
ARRAY  BY  A MATRIX 
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Jus t as  with  the  multiplication  of  a dyad  and  a matrix,  these  three 
multiplications  are  simplified  when  the  three-dimensional  array  is  stored  as  a 
triadic  and  it  is  desired  to  store  the  result  as  a triadic  also.  Let  one  of 
the  triads  be  (a:abc)  and  the  matrix  be  F,  and  consider  the  post- 
multiplication of  each  each  matrix-slice  by  F.  Then  the  result  of  the 
multiplications  in  Figure  10  is  just 


i.) 

(a:a[bTF]Tc) , 

ii.) 

(a:ab[cTF]T) , and 

iii.) 

(o:ab[cTF]T) . 

(That  which  appears  at  first  to  be  an  error  in  (iii)  is  in  fact  a result  of 
the  simple  fact  that  the  vector  in  the  third  position,  associated  with  the 
third  dimension,  is  involved  in  defining  two  sets  of  slabs:  one  set  with  the 

vector  a in  the  first  dimension,  and  one  set  with  the  vector  b in  the  second 
dimension). 

One  use  of  this  ability  to  multiply  polyads  and  matrices  is  in  further 
exploiting  the  polyadic  structure  of  the  matrices  in  (3.6).  It  has  been  shown 
(McCormick  (1983))  that  both  M and  in  (3.6)  can  be  written  in  dyadic  form. 
Let,  therefore,  -M”l  in  (3.6)  be  written  in  symmetric  dyadic  form  as 

-1  m T m 

“M  ■ l aiaiai  =»  l (aiiaiai). 
i-1  i-1 

In  the  unconstrained  case,  (3.6)  becomes 

2 -13 

Veex  * -M  7 LVe{x}Vex, 

3 

and  since  it  has  been  shown  that  V L is  triadic,  let 

3 n 

7 L » l (0j :bjbjbj ) . 

j-1 
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Then,  for  this  unconstrained  case, 


2 m m 

Veex  - l (<x±:a±a±)  l (3j rbjbjbj )Ve{x}Vex. 

i-1  j-1 


Application  of  (ii)  yields 


2 m m t 

Veex  “ I (ai-*aiai)  J (0 j ;bj  [bj ?ex]bj  )* * * * ve(x}  , 

i-1  j»l 


and  application  of  (iii)  yields 


2 m m T T 

veex  - I Cai: aiai ) J (3j : bj [bjVex] [bj Vex] ) . 
i-1  j-1 


Similar  rules  obtain  for  premultiplication  and  yield,  using  the  associative 
law  also. 


2mm  T T 

7eex  * l l (aiSj [aibj ] :ai [bj Vex] [bj Vex] ) , 
i-1  j-1 


the  efficiency  of  which  should  be  apparent.  As  an  aside,  if  it  were  desired, 
as  it  was  in  a recent  application  of  this  technique  to  a problem  for  the 
Department  of  Energy,  to  produce  each  frontal  slab  of  this  in  turn,  the  of 
these  is  given  by 


id  n T T 

l l (ai0j[aibj]aitic:  [bjVex]  [bjVex]), 
i-1  j-1 


where,  as  before,  a^  ^ denotes  the  kch  element  of  the  vector  a^ . 
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3»6  Parameter  Tensors  of  the  Optimal  Value  Function 


In  this  section,  formulae  are  developed  for  the  tensors  (through  order  3) 
of  the  optimal  value  function  f*(e)  = f[x(e),e]  of  problem  P(e)  given  in 
(3.1).  Armacost  and  Fiacco  (1975)  were  the  first  to  extend  basic  first-order 
results  for  the  right-hand-side  perturbation  problem  to  the  general  parametric 
problem  in  (3.1),  and  also  developed  the  second-order  results  given  below  in 
Theorem  8.  Fiacco  (1983)  gives  a complete  treatment  of  all  cases  for  all  the 
variations  of  (3.1).  Our  interest  is  in  providing  results  for  the  third-order 
tensor  of  f*(e)  and  we  begin  with  the  first  and  second-order  cases.  Theorem  8 
is  due  to  Armacost  and  Fiacco  (1975). 

Theorem  8.  (First-  and  Second-Order  Changes  in  f*(e)  for  P(e).) 

Let  x be  a feasible  point  for  P(e)  and.  assume  conditions  (i)  through 
(iv)  in  Theorem  6.  Then,  for  e in  a sufficiently  small  neighborhood  of  e 

a)  f *(e)  - L*. 

b)  Vef *(e)  - Ve  L 

m 

» Vef  - l uiVegi 
i-1 
T 

* Vef  - u Veg,  and 

2 2 2 

c)  Veef*(e)  - VygLVgy  + VggL 

2 T 2 

* ^xe^Vgx  - Vgg  Vgu  + VggL 


Proof . See  Fiacco  (1983). 
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It  is  easy  to  see  from  their  forms  and  from  the  results  given  in  the 

2 

previous  section  that  Vef*(e)  is  monadic  and  V£gf*(e)  is  dyadic.  The  result 
for  the  third-order  tensor  of  f*(e)  is  given  in  the  following. 

Theorem  9.  (Third-Order  Changes  in  f*(e)  for  P(e).) 

If  the  conditions  of  Theorem  8 hold  and  all  third-order  partial  deriva- 
tives in  x and  third-order  partial  derivatives  in  x and  e exist  and  are  con- 
tinuous in  x and  e in  a neighborhood  of  (x,e),  then 

3 2 2 3 3 3 

Veee  f*(e)  » Vy£L7eey  + VyyeLVe{yj'Vey  + 7£ygLV£y  + Vg££L  + Vy££LV£y 

2 2 3 3 3 

“ VxeLVeex  + ^xxeL^e{x}  Vex+  V^gLVgj  u}  Vex  + 7£x£LV£x 

T 2 T T 3 

“ veeu  ” }Ve{x}V£u  - Vg{veg  }Veu  + VegeL, 

3 

and  Vgggf*(e)  is  a triadic. 

Proof . Straightforward  differentiation  of  (c)  in  Theorem  8 gives  the  formula 
3 

for  Vgggf*(e).  The  proof  of  its  triadic  structure  is  also  straightforward 
using  the  results  of  the  previous  sections. 

3.7  Second-Order  Sensitivity  Results  in  Use 
The  direct  use  of  first-  and  second-order  sensitivity  results  is  in 
estimating  the  solution  and  nultiplier  vectors  for  P(e)  as  the  problem  is 
perturbed  away  from  P(e).  This  estimation  is  done  using  the  Taylor  series 
approximations  to  two  and  three  terms: 

y(e)  - y(e ) + V£y(£)(e-<0,  (3.12) 

y(e)  - y(e ) + V£y(e)(e-e)  + j Vg£y(e)(e-£)(e-e),  (3.13) 
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where  the  multiplication  in  the  third  term  on  the  rhs  of  (3.13)  is  understood 

2 

to  be  inner  product  multiplication  that  reduces  the  dimension  of  Veey. 

To  illustrate  this  idea  as  well  as  some  of  the  other  ideas  in  this 
section,  consider  the  following  parameterized  nonlinear  programming  problem: 


minimize  f(x,e)  =*  x^x  + e^exp(£2a^x) . 
x e Rn 

A A 

At  e ■ 0,  the  solution  by  inspection  is  at  x a 0.  Since  L(x,u,e)  =»  f(x,e) 
for  this  problem,  the  first-order  sensitivity  equation,  Y * reduces  to 


2-12 

Vex(e)  - -(V  f)  Vexf. 

Also 


T 

Vf  * 2x  + [eiS2exp(£2a  x)]a, 

^f  ■ 21  + a[eiS2exp(£2a^x) ]a^ , 

and 

V^xf  - a[exp(£2aTx)]  [£2,  £l£2aTx  + £ l ] - 

Since  (2I)"1  - |l,  (^f)"1  can  readily  be  obtained  using  the  Sherman- 
Woodbury-Morrison  formula  (see  McCormick  (1983),  p.  70)  that  gives  the 
inverse  of  a matrix  perturbed  by  a dyad.  This  formula  is 

(A  + ucv^ ) ^“A  A ^u[c( 1 + v^A  ^ uc ) ^ ]v^A  ^ . 


Using  this,  and  letting  c » £i£2exp(£2a  x) , 


(^f)"1-  (21  + acaT)_1=  jl 


r/4  , « T >”1 | T 
al(-  + 2a  a)  Ja  . 
c 


51 


Evaluating  these  at  x ■ 0 and  £ = 0 yields 


and 


Therefore 


vjxf  . 0. 

Vsx(e)  = 0, 


and  the  first-order  sensitivity  analysis  approximation  (3.12)  gives  no 
additional  information  as  the  problem  is  perturbed  away  from  x » 0.  This 
problem  is  ideally  suited  then  for  a second-order  sensitivity  analysis  using 
equations  (3.13)  and  (3.6).  But  since  V£x(e)=0,  (3.6)  reduces  to 

v£ex(e)  - -(V2f )-iVgexf . 

Using  the  formulae  developed  in  the  proof  of  Theorem  3, 

■ Ve(exp[e2aTx] :a[0,  £i£2aTx]T)  + Ve(exp [£2aTx] :a[£2,  e l ]T ) 

, r T , , T ,T , T ,Tn 

* (exp[£2a  x]:a[£2,  e^a  x]  [0,  a x]  ) 

+ (exp[£2aTx]  :a(0,aTx]T  [£2,ei]T) 

+ (exp[£2a^x]  :a[£2,  £le2a^x]^t0,  a^x]^) 

+ (exp [£2aTx]  :a[0, 1 ]T  [ 1 ,0]T) 

+ (exp[£2aTx] :a[l,0]T [0,1 ]T). 


73  f 
v£EXr 
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Notice  the  triadic  nature  of  this  tensor  and  the  frequency  of  occurrence  of 
certain  terms.  Evaluating  these  at  x = 0 and  e * 0 yields 

- a :ae2ej)  + (l:aeie2)> 

where  e^  is  the  ictl  unit  vector  in  Rn.  The  second  partial  derivatives  with 
respect  to  e of  the  solution  vector  are  given  by  — - 

vjex(e)  = -(^fCx))  ^gexf (x) 

■ - jl[(l:ae2ei)  + (l:aeie2)] 

■ (-  j:ae2ei)  + (-  |:aeie2)« 

Substituting  this  into  (3.13)  gives  the  second-order  estimation 

x(e)  - x(e)  + Vex(e)(£-e)  + jVeex(e)(e-£)(e-e), 
which  at  e * 0 becomes 

x(e)  - x(0)  + Vex(0)e  + jV§ex(0)ee. 

Since  x(0)  is  the  solution  to  the  original  problem,  x(0)  = x = 0.  Further- 
more, V£x(0)  was  shown  above  to  vanish  also.  Thus 

‘ x(e)  « |v^ex(0)ee 

* |[(-  |:ae2ei)  + (-  |:aeie2)]ee 
■ (-  4£2ei:a)  + (- 

- (-  |eie2:a).  (3.14) 
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For  a more  concrete  example  of  this  technique,  let  a = (1,2)^  and  perturb 
the  problem  by  e = (-1.902,  .1)T.  The  solution  to  this  new  problem  is  calcu- 
lated in  Jackson  and  McCormick  (1984)  using  Halley's  third-order  method  of 
tangent  hyperbolas , and  is  (.1,  .2)T.  However,  an  estimate  is  given  by  (3.14), 
without  having  to  solve  the  new  nonlinear  programming  problem.  The  approxima- 
tion is 

x(e)  - (-  j(-l .902) ( .1) : [1,2]T) 

- (.0951,  .1902)t, 

which  of  course  is  much  better  than  the  first-order  approximation,  x(e)  = 0. 

Another  use  of  the  second-order  sensitivity  formulae  is  in  solving 
implicitly  defined  optimization  problems.  Consider  for  example  the  optimiza- 
tion problem 

max  F(p,q)  - x*(p,q)  + y*(p,q)  + pq 
(p,q) 

where  (x*,y*)  is  defined  implicitly  as  the  solution  of 

min  G(x,y)  - (x  - y + 3pq)2  + (x  - (p  - q)2)2. 

(x,y) 

The  analytic  solution  of  this  problem  is  easily  obtained  as 

x*(p,q)  - (p  - q)2, 
y*(p»q)  ■ (p  - q)  ■ (p  - q)2  + 3pq* 

Substituting,  the  other  problem  becomes 

max  2p2  + 2q2, 

(p,q) 

with  solution  (p*,q*)  * (0,0). 

Let  e = (pjq)3-*  The  solution  of  the  maximization  problem  will  be  accom- 
plished in  one  iteration  of  Newton's  method  using  the  second-order  sensitivity 
formulas . 
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It  is  required  to  compute 


Now 


And 


2 -1 

£o~t^ee  F(e0)]  VeF(So) 


VeF  - 7ex*  + Vy*  + [q,p]T» 


2 2 2 
veeF  * 7eex*  + 7eey*  + 


0 1 
1 0 


Let  z denote  [x,y]T«  Then  from  first-order  sensitivity  analysis 

2-1 

VgZ*  =■  — (7ggG)  VezG  • 


Now 


and 


2 

7zzG 


1 

1 

(2) [1,-1]  + 

-1 

0 

(2) [1 ,0] 


(4)(pq) [1,-1] . 


2 -1 

The  most  natural  representation  of  (VZZG)  is  in  dyadic  form  and  is 


2 

1 

1 

7ezG  “ 

-1 

(2) [3q,3p]  + 

0 

0 

1 

(l/2)[0,-l]  + 

-1 

1 

(1/2)[1,1] 


The  second-order  sensitivity  formula  (3.6)  can  be  rewritten  (conceptu- 
ally) in  this  case  as 


2 2 -1  r 3 3 23  3 

Veez*  = -(VggG)  [ (7ezzG)  (7ez*)  + (7zzz^)  (7gz*)  + VgezG  + (7zgzG)  ( Vez* ) ] 


For  this  problem  the  only  term  multiplying  the  inverse  which  does  not  vanish 
3 

is  VggzG,  a (2  x 2 x 2)  matrix. 
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Its  triadic  form  is 


7 eezG  = (6:[1,-1]t[1,0]T[0,1]T)  + (6:  [ 1 ,-l]T  [0, 1 ]T  [ 1 ,0]T) 
+ (-4:[l,0]T[l,-l]T[i,-i]T). 

Suppose  [p0»qo]  - [2,1].  Then 


VcZ 


e^o 


0 

1 

1 

» - 

(l/2)[0,-l]  + 

(1/2) [1,1]  + 

-1 

1 

-1 

(2)  [3,6] 


(4) [1,-1] . 


Therefore 


and  thus 


vex0*  - (2,-2]T,  7ey0*  =■  [5,41*2, 

VeF0  - [2,-2]*  + [5,4]T  + [1,21*2  - [8,4]T. 

The  second-order  sensitivity  formulas  yield 

-[7zzGf1[7ezzG]  - -[(1/2:  [0,-1 1*2  [0,-1 1*2)  + (1/2 : [ 1,1  ]T  [ 1 ,1 1*2)  ] 

* [(6:[1, -11*2(1, QJT[0,1]T)  + (6:[1,-1]T[0,1I*2[1,01T) 

+ (-4:  [ 1,01*2  [ 1 ,-l]T  [ i ,-l  JT)  ] . 


To  illustrate  the  computation,  one  of  the  six  product  terms  will  be  computed 


(-1/2: [0,1]T[0,-1]T)  * (6:[1,-1]T[1,0]T[0,1]T) 

- (— (1)( 1/2) (6) [0,-1] [1,-1]T:[0,-1]T[1,0]T[0,1]T) 

- (-3:  [0,1]T[i,0]T[0,i]T. 

U 

In  all,  the  resulting  triadic  form  has  the  following  terras: 

(-3:[0, -1]*2(1, 01*2[0, 11*2) 

+ (-3:(0,-1]*2[0,1]T[1,0]T) 

+ (2:[l,l]T[l>-l]T[l,-UT). 
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From  this. 


and 

2 

^eeyo*  “ 

Thus 

Combining , 

eo 

as  desired. 


7eex0*  - (2:[1,-1]T[1,-1]T) 

(2:[1,-1)T[1,-1JT)  + (3: [0,1]T [ 1 ,0]T)  + (3: [ 1 ,0 ]T [ 0 , 1 ]T ) 


2 2 2 

0 1 

4 0 

veeFo  * veexo*  + ^eeyo*  + 

1 0 

3 

0 4 

2 -1 

- (VeeF0)  (7eF0) 


2 

1 


4 


0 4 


4 


0 

0 


0 
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